Abstract

We study a location-routing problem in the context of capacitated vehicle routing. The input to k-LocVRP is a set of demand locations in a metric space and a fleet of k vehicles each of capacity Q. The objective is to locate k depots, one for each vehicle, and compute routes for the vehicles so that all demands are satisfied and the total cost is minimized. Our main result is a constant-factor approximation algorithm for k-LocVRP. To achieve this result, we reduce k-LocVRP to the following generalization of k median, which might be of independent interest. Given a metric $(V, d)$, bound $k$ and parameter $\rho \in \mathbb{R}_+$, the goal in the k median forest problem is to find $S \subseteq V$ with $|S| = k$ minimizing:

$$\sum_{u \in V} d(u, S) + \rho \cdot d(\mathrm{MST}(V/S)),$$

where $d(u, S) = \min_{w \in S} d(u, w)$ and $\mathrm{MST}(V/S)$ is a minimum spanning tree in the graph obtained by contracting $S$ to a single vertex. We give a $(3 + \epsilon)$-approximation algorithm for k median forest, which leads to a $(12 + \epsilon)$-approximation algorithm for k-LocVRP, for any constant $\epsilon > 0$. The algorithm for k median forest is t-swap local search, and we prove that it has locality gap $3 + \frac{2}{t}$; this generalizes the corresponding result for k median [3].

Finally we consider the k median forest problem when there is a different cost function $c$ for the MST part, i.e. the objective is $\sum_{u \in V} d(u, S) + c(\mathrm{MST}(V/S))$. We show that the locality gap for this problem is unbounded even under multi-swaps, which contrasts with the $c = d$ case. Nevertheless, we obtain a constant-factor approximation algorithm, using an LP based approach along the lines of [12].

1 Introduction 

In typical facility location problems, one wishes to locate centers and connect clients directly to centers at minimum cost. On the other hand, the goal in vehicle routing problems (VRPs) is to compute routes for vehicles originating from a given set of depots. Location routing problems represent an integrated approach, where we wish to make combined decisions on facility location and vehicle routing. This is a widely researched area in operations research; see e.g. the surveys [4, 13, 14, 16, 17]. Most of these papers deal with exact methods or heuristics, without any performance guarantees. In this paper we present an approximation algorithm for a location routing problem in the context of capacitated vehicle routing.

Capacitated vehicle routing (CVRP) is an extensively studied vehicle routing problem [19] which involves distributing identical items to a set of demand locations. Formally we are given a metric space $(V, d)$ on vertices $V$ with distance function $d : V \times V \to \mathbb{R}_+$ that is symmetric and satisfies the triangle inequality. Each vertex $u \in V$ demands $q_u$ units of the item. We have available a fleet of k vehicles, each having capacity $Q$ and located at specified depots. The goal is to distribute items using the k vehicles at minimum total cost. There are two versions of CVRP depending on whether or not the demand at a vertex may be satisfied over multiple visits. We focus on the unsplit delivery version in the paper, while noting that this also implies the result under split-deliveries.

We consider the question "where should one locate the k depots so that the resulting vehicle routing solution has minimum cost?" This is called k-location capacitated vehicle routing (k-LocVRP). The k-LocVRP problem bears obvious similarity to the well-known k median problem, where the goal is to choose k centers to minimize the sum of distances of each vertex to its closest center. The difference is that our problem also takes the routing aspect into account. Not surprisingly, our algorithm for k-LocVRP builds on approximation algorithms for the k median problem.

In obtaining an algorithm for k-LocVRP we introduce the k median forest problem, which might be of some independent interest. The objective here is a combination of k-median and minimum spanning tree. Given metric $(V, d)$, bound $k$ and parameter $\rho \in \mathbb{R}_+$, the goal is to find $S \subseteq V$ with $|S| = k$ minimizing $\sum_{u \in V} d(u, S) + \rho \cdot d(\mathrm{MST}(V/S))$. Here $d(u, S) = \min_{w \in S} d(u, w)$ is the minimum distance between $u$ and an $S$-vertex; $\mathrm{MST}(V/S)$ is a minimum spanning tree in the graph obtained by contracting $S$ to a single vertex. Note that when $\rho = 0$ we have the k-median objective, and $\rho$ being very large reduces to MST.
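
To make the objective concrete, the following is a minimal Python sketch that evaluates it on a tiny instance. The function names (`mst_after_contracting`, `k_median_forest_cost`, `brute_force_optimum`) and the representation of the metric as a callable `dist` are illustrative assumptions, not part of the paper; the exhaustive search is only there to check small examples and is not the algorithm analyzed later.

```python
from itertools import combinations

def mst_after_contracting(points, dist, centers):
    """Length of an MST of the graph in which `centers` is contracted to one vertex."""
    # Prim's algorithm: best[v] is the cheapest edge from v to the growing tree,
    # which initially consists of the contracted center set.
    best = {v: min(dist(v, c) for c in centers) for v in points if v not in centers}
    total = 0
    while best:
        v = min(best, key=best.get)
        total += best.pop(v)
        for u in best:
            best[u] = min(best[u], dist(u, v))
    return total

def k_median_forest_cost(points, dist, centers, rho):
    """sum_u d(u, S) + rho * d(MST(V/S)) for S = centers."""
    median = sum(min(dist(u, c) for c in centers) for u in points)
    return median + rho * mst_after_contracting(points, dist, centers)

def brute_force_optimum(points, dist, k, rho):
    """Exhaustive search over all k-subsets; only sensible for tiny instances."""
    return min((set(S) for S in combinations(points, k)),
               key=lambda S: k_median_forest_cost(points, dist, S, rho))
```

For example, on the line metric with points 0, 1, 2, 10, 11 and centers {1, 10}, setting `rho = 0` gives the plain k-median cost, while `rho = 1` adds the length of the forest that attaches every remaining point to a center.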

1.1 Our Results 

The main result is the following. 

Theorem 1 There is a $(12 + \epsilon)$-approximation algorithm for k-LocVRP, for any constant $\epsilon > 0$.

Our algorithm first reduces k-LocVRP to k median forest, at the loss of a constant approximation factor of four. This step is fairly straightforward and makes use of known lower bounds [8] for the CVRP problem. We present this reduction in Section 2.

Then we prove the following result in Section 3, which implies Theorem 1.

Theorem 2 There is a $(3 + \epsilon)$-approximation algorithm for k median forest, for any constant $\epsilon > 0$.

This is the technically most interesting part of the paper. The algorithm is straightforward: perform local search using multi-swaps. It is well known that (single swap) local search is optimal for the minimum spanning tree problem. Moreover, Arya et al. [3] showed that t-swap local search achieves exactly a $(3 + \frac{2}{t})$-approximation ratio for the k-median objective (this proof was later simplified by Gupta and Tangwongsan [7]). Thus one can hope that local search performs well for k median forest, which is a combination of both MST and k-median objectives. However, the local moves used in proving the quality of local optima are different for the MST and k-median objectives. Our proof shows we can simultaneously bound both MST and k-median objectives using a common set of local moves. In fact we prove that the locality gap for k median forest under t-swaps is also $(3 + \frac{2}{t})$. Somewhat surprisingly, it suffices to consider exactly the same set of swaps from [7] to establish Theorem 2, although [7] does not take into account any MST contribution. The interesting part of the proof is in bounding the change in MST cost due to these swaps; this makes use of non-trivial exchange properties of spanning trees and properties of the potential swaps from [7]. We remark that the k-median, k-tree (i.e. choose k centers $S$ to minimize $d(\mathrm{MST}(V/S))$), and k median forest objectives are incomparable in general: Appendix A gives an instance where near-optimal solutions to these three objectives are mutually far apart.

Finally we consider the non-uniform k median forest problem in Section 4. This is an extension of k median forest where there is a different cost function $c$ for the MST part in the objective. Given vertices $V$ with two metrics $d$ and $c$, and bound $k$, the goal is to find $S \subseteq V$ with $|S| = k$ minimizing $\sum_{u \in V} d(u, S) + c(\mathrm{MST}(V/S))$. Here $\mathrm{MST}(V/S)$ is a minimum spanning tree in the graph obtained by contracting $S$ to a single vertex, under metric $c$. In contrast to the uniform case $c = d$, we show that the locality gap here is unbounded even for multi-swaps. In light of this, Theorem 2 appears a bit surprising. Still, we show that a different LP-based approach yields:

Theorem 3 There is a 16-approximation algorithm for non-uniform k median forest. 

This algorithm follows closely that for the matroid median problem [12]. We consider the natural LP relaxation and round it in two phases. The first phase sparsifies the solution (using ideas from [6]) and allows us to reformulate a new LP-relaxation using fewer variables; this is identical to [12]. The second phase solves the new LP-relaxation, which we show to be integral.

1.2 Related Work 

The basic capacitated vehicle routing problem involves a single fixed depot. There are two versions of CVRP: split delivery, where the demand of a vertex may be satisfied over multiple visits; and unsplit delivery, where the demand at a vertex must be satisfied in a single visit (in this case we also assume $\max_{u \in V} q_u \le Q$). Observe that the optimal value under split-delivery is at most that under unsplit-delivery. The best known approximation guarantee for split-delivery is $\alpha + 1$ [8, 2] and for unsplit-delivery is $\alpha + 2$ [1], where $\alpha$ denotes the best approximation ratio for the Traveling Salesman Problem. We make use of the following known lower bounds for CVRP with single depot $r$: the minimum TSP tour on all demand locations, and $\frac{2}{Q} \sum_{u \in V} q_u \cdot d(u, r)$. Similar constant factor approximation algorithms [15] are also known for the CVRP with multiple depots which was defined in the introduction.

The k median problem is a widely studied location problem and has many constant factor approximation algorithms. Starting with the LP-rounding algorithm of [6], the primal-dual approach was used in [11], and also local search [3]. A simpler analysis of the local search algorithm was given in [7]; we make use of this in our proof for the k median forest problem. Several variants of k median have also been studied. One that is relevant to us is the matroid median problem [12], where the set of open centers is constrained to be independent in some matroid; our approximation algorithm for the non-uniform k median forest problem is based on this approach.

Recently [9] studied (among other problems) a facility-location variant of CVRP: there are opening costs for depots and the goal is to open a set of depots and find vehicle routes so as to minimize the sum of opening and routing costs. The k-LocVRP problem in this paper can be thought of as the k-median variant of [9]. In [9] the authors give a 4.38-approximation algorithm for facility-location CVRP. Following an approach similar to [9] one can obtain a bicriteria approximation algorithm for k-LocVRP, where more than k depots are opened. However more work is needed to obtain a true approximation, and this is where we need an algorithm for the k median forest problem.

2 Reducing k-LocVRP to k median forest

Here we show that the k-LocVRP problem can be reduced to k median forest at the loss of a constant approximation factor. This makes use of known lower bounds for CVRP [8, 15, 9].

For any $S \subseteq V$, let $\mathrm{Flow}(S) := \frac{2}{Q} \sum_{u \in V} q_u \cdot d(u, S)$ and $\mathrm{Tree}(S) = d(\mathrm{MST}(V/S))$ be the length of the minimum spanning tree in the metric obtained by contracting $S$. The following theorem is implicit in previous work [8, 15, 9]; this uses a natural MST splitting algorithm.

Theorem 4 ([9]) Given any instance of CVRP on metric $(V, d)$ with demands $\{q_u\}_{u \in V}$, vehicle capacity $Q$ and depots $S \subseteq V$,

• The optimal value (of the split-delivery CVRP) is at least $\max\{\mathrm{Flow}(S), \mathrm{Tree}(S)\}$.

• There is a polynomial time algorithm that computes an unsplit-delivery solution of length at most $2 \cdot \mathrm{Flow}(S) + 2 \cdot \mathrm{Tree}(S)$.
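
The two bounds of Theorem 4 are easy to evaluate for a given depot set. The sketch below is illustrative only: it assumes the normalization $\mathrm{Flow}(S) = \frac{2}{Q}\sum_{u} q_u \cdot d(u,S)$, represents the metric as a callable `dist` and the demands as a dictionary `demand`, and uses hypothetical function names. It computes the bounds, not the routes themselves.

```python
def flow(points, dist, demand, depots, Q):
    """Flow(S) = (2/Q) * sum_u q_u * d(u, S); the 2/Q scaling is an assumption."""
    return 2 * sum(demand[u] * min(dist(u, s) for s in depots) for u in points) / Q

def tree(points, dist, depots):
    """Tree(S): MST length after contracting the depots (Prim's algorithm)."""
    best = {v: min(dist(v, s) for s in depots) for v in points if v not in depots}
    total = 0
    while best:
        v = min(best, key=best.get)
        total += best.pop(v)
        for u in best:
            best[u] = min(best[u], dist(u, v))
    return total

def cvrp_bounds(points, dist, demand, depots, Q):
    """Return (lower bound, upper bound) on the CVRP cost for fixed depots."""
    f, t = flow(points, dist, demand, depots, Q), tree(points, dist, depots)
    return max(f, t), 2 * f + 2 * t
```

Since $2 \cdot \mathrm{Flow}(S) + 2 \cdot \mathrm{Tree}(S) \le 4 \cdot \max\{\mathrm{Flow}(S), \mathrm{Tree}(S)\}$, the two returned values are always within a factor of four of each other.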

Based on this it is clear that the optimal value of the CVRP instance given depot positions $S$ is roughly given by $\mathrm{Flow}(S) + \mathrm{Tree}(S)$, which is similar to the k median forest objective. The following lemma formalizes this reduction. We will assume an algorithm for the k median forest problem with vertex-weights $\{q_u : u \in V\}$, where the objective becomes $\sum_{u \in V} q_u \cdot d(u, S) + \rho \cdot d(\mathrm{MST}(V/S))$.

Lemma 5 If there is a $\beta$-approximation algorithm for k median forest then there is a $4\beta$-approximation algorithm for k-LocVRP.



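Proof: Let Opt denote the optimal value of the k-LocVRP instance. Using the lower bound in Theorem 4,

$$\mathrm{Opt} \ge \min_{S : |S| = k} \max\{\mathrm{Flow}(S), \mathrm{Tree}(S)\} \ge \min_{S : |S| = k} \left[\epsilon \cdot \mathrm{Flow}(S) + (1 - \epsilon) \cdot \mathrm{Tree}(S)\right],$$

where $\epsilon \in [0, 1]$ is any value; this will be fixed later. Consider the instance of k median forest on metric $(V, d)$, vertex weights $\{q_u\}_{u \in V}$ and parameter $\rho = \frac{Q}{2} \cdot \frac{1 - \epsilon}{\epsilon}$. For any $S \subseteq V$ the objective is:

$$\sum_{u \in V} q_u \cdot d(u, S) + \rho \cdot d(\mathrm{MST}(V/S)) = \frac{Q}{2} \cdot \mathrm{Flow}(S) + \rho \cdot \mathrm{Tree}(S) = \frac{Q}{2\epsilon} \cdot \left[\epsilon \cdot \mathrm{Flow}(S) + (1 - \epsilon) \cdot \mathrm{Tree}(S)\right].$$

Thus the optimal value of the k median forest instance is at most $\frac{Q}{2\epsilon} \cdot \mathrm{Opt}$. Let $S_{alg}$ denote the solution found by the $\beta$-approximation algorithm for k median forest. It follows that $|S_{alg}| = k$ and:

$$\epsilon \cdot \mathrm{Flow}(S_{alg}) + (1 - \epsilon) \cdot \mathrm{Tree}(S_{alg}) \le \beta \cdot \mathrm{Opt} \qquad (1)$$

For the k-LocVRP instance, we locate the depots at $S_{alg}$. Using Theorem 4, the cost of the resulting vehicle routing solution is at most $2 \cdot \mathrm{Flow}(S_{alg}) + 2 \cdot \mathrm{Tree}(S_{alg}) = 4 \cdot \left[\epsilon \cdot \mathrm{Flow}(S_{alg}) + (1 - \epsilon) \cdot \mathrm{Tree}(S_{alg})\right]$, where we set $\epsilon = 1/2$. From Inequality (1) it follows that our algorithm is a $4\beta$-approximation algorithm for k-LocVRP. ■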

We remark that this reduction already gives us a constant factor bicriteria approximation algorithm for k-LocVRP as follows. Let $S_{med}$ denote an approximate solution to k-median on metric $(V, d)$ with vertex-weights $\{q_u : u \in V\}$, which can be obtained by directly using a k-median algorithm [3]. Let $S_{mst}$ denote the optimal solution to $\min_{S : |S| \le k} d(\mathrm{MST}(V/S))$, which can be obtained using the greedy MST algorithm. We output $S_{bi} = S_{med} \cup S_{mst}$ as a solution to k-LocVRP, along with the vehicle routes obtained from Theorem 4 applied to $S_{bi}$. Note that $|S_{bi}| \le 2k$, so we open at most $2k$ depots. Moreover, if $S^*$ denotes the location of depots in the optimal solution to k-LocVRP then:

• $\mathrm{Flow}(S_{med}) \le (3 + \delta) \cdot \mathrm{Flow}(S^*)$ since we used a $(3 + \delta)$-approximation algorithm for k-median [3].

• $\mathrm{Tree}(S_{mst}) \le \mathrm{Tree}(S^*)$ since $S_{mst}$ is an optimal solution to the MST part of the objective.

Clearly $\mathrm{Flow}(S_{bi}) \le \mathrm{Flow}(S_{med})$ and $\mathrm{Tree}(S_{bi}) \le \mathrm{Tree}(S_{mst})$, so:

$$\tfrac{1}{2} \cdot \mathrm{Flow}(S_{bi}) + \tfrac{1}{2} \cdot \mathrm{Tree}(S_{bi}) \le \tfrac{3 + \delta}{2} \cdot \left[\mathrm{Flow}(S^*) + \mathrm{Tree}(S^*)\right] \le (3 + \delta) \cdot \mathrm{Opt}$$

Using Theorem 4, the cost of the CVRP solution with depots $S_{bi}$ is at most $4(3 + \delta) \cdot \mathrm{Opt}$. So this gives a $(12 + \delta, 2)$ bicriteria approximation algorithm for k-LocVRP, where $\delta > 0$ is any fixed constant. We note that this approach combined with algorithms for facility-location and Steiner tree immediately gives a constant factor approximation for the facility location CVRP considered in [9]. The algorithm in that paper [9] has to do some more work in order to get a sharper constant. For k-LocVRP this approach clearly does not give any true approximation ratio, and for this purpose we give an algorithm for k median forest.
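
The "greedy MST algorithm" for the tree part can be sketched as follows: run Kruskal's algorithm, stop once k components remain, and open one center per component. This is a minimal illustration under that reading of the remark; the function name `k_tree_centers` and the callable metric `dist` are assumptions made for the example.

```python
from itertools import combinations

def k_tree_centers(points, dist, k):
    """Pick at most k centers minimizing d(MST(V/S)) via Kruskal stopped early."""
    points = list(points)
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    components, forest_cost = len(points), 0
    for w, i, j in edges:
        if components <= k:          # k trees remain: stop merging
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1
            forest_cost += w
    # One representative (the union-find root) per remaining component.
    roots = sorted({find(i) for i in range(len(points))})
    return [points[r] for r in roots], forest_cost
```

The returned `forest_cost` equals $\mathrm{Tree}(S)$ for the returned centers, because contracting one vertex from each component turns the spanning forest into a spanning tree of $V/S$.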



3 Multi-swap local search for k median forest 

The input to k median forest consists of a metric $(V, d)$, vertex-weights $\{q_u\}_{u \in V}$ and bound $k$. The goal is to find $S \subseteq V$ with $|S| = k$ minimizing:

$$\Phi(S) = \sum_{u \in V} q_u \cdot d(u, S) + d(\mathrm{MST}(V/S)),$$

where $d(u, S) = \min_{w \in S} d(u, w)$ and $\mathrm{MST}(V/S)$ is a minimum spanning tree in the graph obtained by contracting $S$ to a single vertex. Note that this is slightly more general than the definition in Section 1 (which is the special case when $q_u = 1/\rho$ for all $u \in V$).

We analyze the natural t-swap local search for this problem, for any constant $t$. Starting at an arbitrary solution $L$ consisting of $k$ centers do the following until no improvement is possible: if there exists $D \subseteq L$ and $A \subseteq V \setminus L$ with $|D| = |A| \le t$ and $\Phi((L \setminus D) \cup A) < \Phi(L)$ then $L \leftarrow (L \setminus D) \cup A$. Clearly each local step can be performed in $n^{O(t)}$ time which is polynomial for fixed $t$. The number of iterations to reach a local optimum may be super-polynomial; however this can be made polynomial by the standard method [3] of performing a local move only if the cost $\Phi$ reduces by some $1 + \frac{1}{\mathrm{poly}(n)}$ factor. Here we omit this (minor) detail and bound the local optimum under the swaps as defined above.
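
The local search just described can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the names `phi` and `local_search`, the callable metric `dist`, and the weight dictionary are assumptions, and the sketch accepts any strictly improving swap (it omits the $1 + \frac{1}{\mathrm{poly}(n)}$ threshold, so its running time is not guaranteed to be polynomial).

```python
from itertools import combinations

def phi(points, dist, weight, centers):
    """Phi(S) = sum_u q_u * d(u, S) + d(MST(V/S))."""
    median = sum(weight[u] * min(dist(u, c) for c in centers) for u in points)
    best = {v: min(dist(v, c) for c in centers) for v in points if v not in centers}
    tree = 0
    while best:                       # Prim's algorithm from the contracted centers
        v = min(best, key=best.get)
        tree += best.pop(v)
        for u in best:
            best[u] = min(best[u], dist(u, v))
    return median + tree

def local_search(points, dist, weight, k, t, start=None):
    """t-swap local search: apply improving (D, A) swaps until none exists."""
    current = set(start) if start is not None else set(list(points)[:k])
    improved = True
    while improved:
        improved = False
        base = phi(points, dist, weight, current)
        inside = [v for v in points if v in current]
        outside = [v for v in points if v not in current]
        for size in range(1, min(t, len(inside), len(outside)) + 1):
            for drop in combinations(inside, size):
                for add in combinations(outside, size):
                    cand = (current - set(drop)) | set(add)
                    if phi(points, dist, weight, cand) < base:
                        current, improved = cand, True
                        break
                if improved:
                    break
            if improved:
                break
    return current
```

On the line metric with unit-weight points 0, 1, 2, 10, 11, starting from the poor solution {0, 1} with `k = 2` and `t = 1`, the search moves through {1, 2} and {2, 10} before stopping at {1, 10}.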

Let $F \subseteq V$ denote the local optimum solution (under t-swaps) and $F^* \subseteq V$ the global optimum. Note that $|F| = |F^*| = k$. Define map $\eta : F^* \to F$ as $\eta(w) = \arg\min_{v \in F} d(w, v)$ for all $w \in F^*$. For any $S \subseteq V$, let $\mathrm{Med}(S) := \sum_{u \in V} q_u \cdot d(u, S)$, and $\mathrm{Tree}(S) = d(\mathrm{MST}(V/S))$ be the length of the minimum spanning tree in the metric obtained by contracting $S$; so $\Phi(S) = \mathrm{Med}(S) + \mathrm{Tree}(S)$. For any $D \subseteq F$ and $A \subseteq V \setminus F$ with $|D| = |A| \le t$ we refer to the swap $F - D + A$ as a "$(D, A)$ swap". We use the following swap construction from [7] for the k-median problem.

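Theorem 6 ([7]) For any $F, F^* \subseteq V$ with $|F| = |F^*| = k$, there are partitions $\{F_i\}_{i=1}^{\ell}$ of $F$ and $\{F_i^*\}_{i=1}^{\ell}$ of $F^*$ such that $|F_i| = |F_i^*|$ for all $i \in [\ell]$, and there is a unique $c_i \in F_i$ (for each $i \in [\ell]$) with $\eta(w) = c_i$ for all $w \in F_i^*$ and $\eta^{-1}(v) = \emptyset$ for all $v \in F_i \setminus \{c_i\}$. Define set $\mathcal{S}$ of t-swaps with multipliers $\{\alpha(s) : s \in \mathcal{S}\}$ as:

• For any $i \in [\ell]$, if $|F_i| \le t$ then swap $(F_i, F_i^*) \in \mathcal{S}$ with $\alpha(F_i, F_i^*) = 1$.

• For any $i \in [\ell]$, if $|F_i| > t$ then for each $a \in F_i^*$ and $b \in F_i \setminus \{c_i\}$ swap $(b, a) \in \mathcal{S}$ with $\alpha(b, a) = \frac{1}{|F_i| - 1}$.

Then we have:

• $\sum_{(D,A) \in \mathcal{S}} \alpha(D, A) \cdot \left(\mathrm{Med}(F - D + A) - \mathrm{Med}(F)\right) \le (3 + 2/t) \cdot \mathrm{Med}(F^*) - \mathrm{Med}(F)$.

• For each $w \in F^*$, the extent to which $w$ is added is $\sum_{(D,A) \in \mathcal{S} : w \in A} \alpha(D, A) = 1$.

• For each $v \in F$, the extent to which $v$ is dropped is $\sum_{(D,A) \in \mathcal{S} : v \in D} \alpha(D, A) \le 1 + \frac{1}{t}$.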
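
The last two properties can be checked mechanically once the partitions are given. The sketch below builds the swap set from a list of parts and tallies how much each vertex is added or dropped. It assumes the multiplier $\frac{1}{|F_i| - 1}$ for single swaps in large parts; the function names and the `(F_i, Fstar_i, c_i)` input format are hypothetical.

```python
from fractions import Fraction
from collections import defaultdict

def build_swaps(parts, t):
    """parts: list of (F_i, Fstar_i, c_i) with |F_i| == |Fstar_i| and c_i in F_i."""
    swaps = []
    for F_i, Fstar_i, c_i in parts:
        if len(F_i) <= t:
            # Small part: swap the whole part at once with multiplier 1.
            swaps.append((frozenset(F_i), frozenset(Fstar_i), Fraction(1)))
        else:
            # Large part: all single swaps (b, a) with b != c_i, equally weighted.
            for a in Fstar_i:
                for b in F_i:
                    if b != c_i:
                        swaps.append((frozenset([b]), frozenset([a]),
                                      Fraction(1, len(F_i) - 1)))
    return swaps

def extents(swaps):
    """Total multiplier with which each vertex is added / dropped."""
    added, dropped = defaultdict(Fraction), defaultdict(Fraction)
    for D, A, alpha in swaps:
        for w in A:
            added[w] += alpha
        for v in D:
            dropped[v] += alpha
    return added, dropped
```

For a part with $|F_i| > t$, each $a \in F_i^*$ appears in $|F_i| - 1$ swaps, giving total weight one, while each $b \ne c_i$ appears in $|F_i|$ swaps, giving $\frac{|F_i|}{|F_i| - 1} \le 1 + \frac{1}{t}$.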

We use the same set $\mathcal{S}$ of swaps for the k median forest problem and will show the following:

$$\sum_{(D,A) \in \mathcal{S}} \alpha(D, A) \cdot \left(\mathrm{Tree}(F - D + A) - \mathrm{Tree}(F)\right) \le (3 + 2/t) \cdot \mathrm{Tree}(F^*) - \mathrm{Tree}(F) \qquad (2)$$

Combined with the similar inequality in Theorem 6 (for Med) and using local optimality of $F$, we would obtain the main result of this section:

Theorem 7 The t-swap local search algorithm for k median forest is a $(3 + \frac{2}{t})$-approximation.
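
Spelled out, the step from Inequality (2) to Theorem 7 is the following short derivation. It uses only local optimality of $F$ (every swap in $\mathcal{S}$ is a t-swap, so it does not decrease $\Phi$), non-negativity of the multipliers $\alpha$, the first property of Theorem 6, and Inequality (2).

```latex
\begin{align*}
0 &\le \sum_{(D,A)\in\mathcal{S}} \alpha(D,A)\,\bigl[\Phi(F-D+A)-\Phi(F)\bigr] \\
  &= \sum_{(D,A)\in\mathcal{S}} \alpha(D,A)\,\bigl[\mathrm{Med}(F-D+A)-\mathrm{Med}(F)\bigr]
   + \sum_{(D,A)\in\mathcal{S}} \alpha(D,A)\,\bigl[\mathrm{Tree}(F-D+A)-\mathrm{Tree}(F)\bigr] \\
  &\le \Bigl(3+\tfrac{2}{t}\Bigr)\mathrm{Med}(F^*)-\mathrm{Med}(F)
   + \Bigl(3+\tfrac{2}{t}\Bigr)\mathrm{Tree}(F^*)-\mathrm{Tree}(F) \\
  &= \Bigl(3+\tfrac{2}{t}\Bigr)\Phi(F^*)-\Phi(F).
\end{align*}
```

Rearranging gives $\Phi(F) \le (3 + \frac{2}{t}) \cdot \Phi(F^*)$.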

It remains to prove (2), which we do in the rest of the section. Consider a graph $H$ which is the complete graph on vertices $V \cup \{r\}$ (for a new vertex $r$). If $E = \binom{V}{2}$ denotes the edges in the metric, $H$ has edges $E \cup \{(r, v) : v \in V\}$; the edges $\{(r, v) : v \in V\}$ are called root-edges and edges $E$ are true-edges. Let $M$ denote the spanning tree of $H$ consisting of edges $\mathrm{MST}(V/F) \cup \{(r, v) : v \in F\}$; similarly $M^*$ is the spanning tree $\mathrm{MST}(V/F^*) \cup \{(r, v) : v \in F^*\}$. For ease of notation, for any subset $S \subseteq V$, when it is clear from context we will use $S$ to also denote the set $\{(r, v) : v \in S\}$ of root-edges. We start with the following exchange property (which holds more generally for any matroid), see Equation (42.15) in Schrijver [18].

Figure 1: The partitions used in local search proof (the example has $k = 8$ and $t = 2$).



Theorem 8 ([18]) Given two spanning trees $T_1$ and $T_2$ in a graph $H$ and a partition $\{T_1(i)\}_{i=1}^{p}$ of the edges of $T_1$, there exists a partition $\{T_2(i)\}_{i=1}^{p}$ of edges of $T_2$ such that $(T_2 \setminus T_2(i)) \cup T_1(i)$ is a spanning tree in $H$ for each $i \in [p]$. (This also implies $|T_2(i)| = |T_1(i)|$ for all $i \in [p]$.)

We will apply Theorem 8 on trees $M^*$ and $M$. Throughout, $M^*$ and $M$ represent the corresponding edge-sets. Recall the partition $\{F_i^*\}_{i=1}^{\ell}$ of $F^*$ from Theorem 6; we refine it by splitting parts of size larger than $t$ into singletons, and let $\mathcal{F}^*$ denote the resulting partition (see Figure 1). The reason behind splitting the large parts of $\{F_i^*\}_{i=1}^{\ell}$ is to ensure the following property (recall swaps $\mathcal{S}$ from Theorem 6).

Claim 9 For each swap $(D, A) \in \mathcal{S}$, $A \subseteq F^*$ appears as a part in $\mathcal{F}^*$. Moreover, for each part $A'$ in $\mathcal{F}^*$ there is some swap $(D', A') \in \mathcal{S}$.

Consider the partition $\mathcal{P}^*$ of $M^*$ with parts $\mathcal{F}^* \cup \{\{e\}\}_{e \in M^* \setminus F^*}$, i.e. each true edge lies in a singleton part and the root edges form the partition $\mathcal{F}^*$ defined above. Let $\mathcal{P}$ denote the partition of $M$ obtained by applying Theorem 8 with partition $\mathcal{P}^*$ of $M^*$; note also that there is a pairing between parts of $\mathcal{P}$ and $\mathcal{P}^*$. Let $M' \subseteq M \cap E$ denote the true edges of $M$ that are paired with true edges of $M^*$; and $M'' = (M \cap E) \setminus M'$ are the remaining true edges of $M$ (see also Figure 1). We will bound the cost of $M'$ and $M''$ separately.

Claim 10 $\sum_{e \in M'} d_e \le \sum_{h \in E \cap M^*} d_h$.

Proof: Fix any $e \in M'$. By the definition of $M'$ it follows that there is an $h \in E \cap M^*$ such that part $\{h\}$ in $\mathcal{P}^*$ is paired with part $\{e\}$ in $\mathcal{P}$. In particular, $M - e + h$ is a spanning tree in $H$. Note that the root edges in $M - e + h$ are exactly $F$, and so $M - e + h$ is a spanning tree in the original metric graph $(V, E)$ when we contract vertices $F$. Since $M = \mathrm{MST}(V/F)$ is the minimum such tree, we have $d(M) - d_e + d_h \ge d(M)$, or $d_e \le d_h$. Summing over all $e \in M'$ and observing that each edge $h \in E \cap M^*$ can be paired with at most one $e \in M'$, we obtain the claim. ■

Consider the connected components (in fact a forest) induced by true-edges of $M$: for each $f \in F$ let $C_f \subseteq V$ denote the vertices connected to $f$. Note that $\{C_f : f \in F\}$ partitions $V$.

Now consider the forest induced by true edges of $M^*$ (i.e. $E \cap M^*$) and direct each edge towards an $F^*$-vertex (note that each tree in this forest contains exactly one $F^*$-vertex). Observe that each vertex $v \in V \setminus F^*$ has exactly one out-edge $\sigma_v$, and $F^*$-vertices have none.

For each $f \in F$, define $T_f := \{\sigma_v : v \in C_f\}$, the set of out-edges from $C_f$.

Claim 11 $\sum_{f \in F} d(T_f) = d(E \cap M^*)$.

Proof: It is clear that $\{T_f\}_{f \in F}$ partitions $E \cap M^*$. ■

We are now ready to bound the increase in the Tree cost under swaps $\mathcal{S}$. By Claim 9 it follows that for each swap $(D, A) \in \mathcal{S}$, $A$ is a part in $\mathcal{F}^*$ (and so in $\mathcal{P}^*$); define $E_A$ as the true-edges of $M$ (possibly empty) that are paired with the part $A$ of $\mathcal{P}^*$.

Claim 12 $\{E_A : (D, A) \in \mathcal{S}\}$ is a partition of $M''$.

Proof: Consider the partition $\mathcal{P}$ of $M$ given by Theorem 8 applied to $\mathcal{P}^*$. By definition, $M' \subseteq E \cap M$ are the true edges of $M$ paired (by $\mathcal{P}$ and $\mathcal{P}^*$) with true edges of $M^*$; and $M'' = (E \cap M) \setminus M'$ are paired with parts from $\mathcal{F}^*$ (i.e. root edges of $M^*$). For each part $\pi \in \mathcal{F}^*$ (and also $\mathcal{P}^*$) let $E(\pi) \subseteq M''$ denote the $M''$-edges paired with $\pi$. It follows that $\{E(\pi) : \pi \in \mathcal{F}^*\}$ partitions $M''$. Using the second fact in Claim 9 and the definition of $E_A$, we have $\{E_A : (D, A) \in \mathcal{S}\} = \{E(\pi) : \pi \in \mathcal{F}^*\}$, a partition of $M''$. ■

We prove the following key lemma. 
Lemma 13 For each swap $(D, A) \in \mathcal{S}$, $\mathrm{Tree}(F - D + A) - \mathrm{Tree}(F) \le 2 \cdot \sum_{f \in D} d(T_f) - d(E_A)$.

Proof: By Claim 9, $A \subseteq F^*$ is a part in $\mathcal{P}^*$. Recall that $E_A$ denotes the true-edges of $M$ paired with $A$; let $F_A$ denote the root-edges of $M$ paired with $A$. Then using Theorem 8 it follows that $(M \setminus E_A \setminus F_A) \cup A$ is a spanning tree in $H$. Hence $S_A := (E \cap M) \setminus E_A$ is a forest with each component containing some vertex from $F \cup A$; for any $f \in F \cup A$ let $C'_f$ denote vertices in the component containing $f$. In other words, $S_A$ connects each vertex to some vertex of $F \cup A$.

Consider the edge set $S'_A := S_A \cup \left(\bigcup_{f \in D} T_f\right)$. We will add some edges $N$ so that $S'_A \cup N$ connects each $D$-vertex to some vertex of $F - D + A$. Since $S_A$ already connects all vertices to $F \cup A$, it would follow that $S'_A \cup N$ connects all vertices to $F - D + A$, i.e.

$$\mathrm{Tree}(F + A - D) \le d(S'_A) + d(N) \le \mathrm{Tree}(F) - d(E_A) + \sum_{f \in D} d(T_f) + d(N).$$

To prove the lemma it now suffices to construct a set $N$ with $d(N) \le \sum_{f \in D} d(T_f)$, such that $S'_A \cup N$ connects each $D$-vertex to $F - D + A$. Below, for any $V' \subseteq V$ we use $\delta(V')$ to denote the edges of $S'_A$ between $V'$ and $V \setminus V'$.



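Constructing N. Consider any minimal $U \subseteq D$ such that $\delta\left(\bigcup_{f \in U} C'_f\right) = \emptyset$; recall that the sets $C'_f$ are the connected components of $S_A \subseteq S'_A$. By minimality of $U$, it follows that $\bigcup_{f \in U} C'_f$ is connected in $S'_A$. We now prove two simple claims:

Claim 14 For any $f^* \in F^* \setminus A$ we have $\eta(f^*) \notin D$.

Proof: By construction of the swaps $\mathcal{S}$ in Theorem 6. ■

Claim 15 There exist $f^* \in F^* \cap \left(\bigcup_{f \in U} C'_f\right)$ and $f' \in U$ such that $\bigcup_{f \in U} T_f$ contains a path between $f'$ and $f^*$.

Proof: Let $f'$ be any vertex in $U$. Consider the directed path $P$ from $f'$ obtained by following out-edges $\sigma$ until the first occurrence of a vertex $v \in F^*$ or $v \in V \setminus \left(\bigcup_{f \in U} C'_f\right)$. Since $F^*$-vertices are the only ones with no out-edge $\sigma$, and $\{\sigma_w : w \in V\} = E \cap M^*$ is acyclic, there must exist such a vertex $v \in F^* \cup \left(V \setminus \bigcup_{f \in U} C'_f\right)$. Observe that $C'_f \subseteq C_f$ for all $f \in D \supseteq U$; recall that the sets $C_f$ (resp. $C'_f$) are the connected components in $M$ (resp. $S_A \subseteq M$). So $P \subseteq \{\sigma_w : w \in \bigcup_{f \in U} C'_f\} \subseteq \{\sigma_w : w \in \bigcup_{f \in U} C_f\} = \bigcup_{f \in U} T_f$. Suppose that vertex $v \notin F^*$; then $v \in V \setminus \left(\bigcup_{f \in U} C'_f\right)$, which implies $\delta\left(\bigcup_{f \in U} C'_f\right) \ne \emptyset$ since path $P \subseteq S'_A$ leaves $\bigcup_{f \in U} C'_f$, contradicting the choice of $U$. So we have $v \in F^* \cap \left(\bigcup_{f \in U} C'_f\right)$ and $P \subseteq \bigcup_{f \in U} T_f$ is a path from $f'$ to $v$. ■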



Consider $f^*$ and $f'$ as given by Claim 15. If $f^* \in A$ then the component $\bigcup_{f \in U} C'_f$ of $S'_A$ is already connected to $F - D + A$. Otherwise by Claim 14 we have $\eta(f^*) \notin D$; in this case we add edge $(f^*, \eta(f^*))$ to $N$, which connects component $\bigcup_{f \in U} C'_f$ to $\eta(f^*) \in F - D \subseteq F - D + A$. Now using Claim 15, $d(f^*, \eta(f^*)) \le d(f^*, f') \le \sum_{f \in U} d(T_f)$.¹ In either case, $U$ is connected to $F - D + A$ in $S'_A \cup N$, and the cost of $N$ increases by at most $\sum_{f \in U} d(T_f)$.

We apply the above argument to every minimal $U \subseteq D$ with $\delta\left(\bigcup_{f \in U} C'_f\right) = \emptyset$. The increase in cost of $N$ due to each such $U$ is at most $\sum_{f \in U} d(T_f)$. Since such minimal sets $U$ are disjoint, we have $d(N) \le \sum_{f \in D} d(T_f)$. Clearly $S'_A \cup N$ connects each $D$-vertex to $F - D + A$. ■

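Using this lemma for each $(D, A) \in \mathcal{S}$ weighted by $\alpha(D, A)$ (from Theorem 6) and adding,

$$
\begin{aligned}
\sum_{(D,A) \in \mathcal{S}} \alpha(D, A) \cdot \left[\mathrm{Tree}(F - D + A) - \mathrm{Tree}(F)\right]
&\le 2 \sum_{(D,A) \in \mathcal{S}} \alpha(D, A) \sum_{f \in D} d(T_f) - \sum_{(D,A) \in \mathcal{S}} \alpha(D, A) \cdot d(E_A) && (3) \\
&= 2 \sum_{f \in F} \Bigl( \sum_{(D,A) \in \mathcal{S} : f \in D} \alpha(D, A) \Bigr) \cdot d(T_f) - \sum_{e \in M''} \Bigl( \sum_{(D,A) \in \mathcal{S} : e \in E_A} \alpha(D, A) \Bigr) \cdot d_e && (4) \\
&\le 2 \Bigl(1 + \frac{1}{t}\Bigr) \sum_{f \in F} d(T_f) - \sum_{e \in M''} d_e && (5) \\
&= 2 \Bigl(1 + \frac{1}{t}\Bigr) \cdot d(E \cap M^*) - d(M'') && (6)
\end{aligned}
$$

Above, (3) is by Lemma 13; (4) is by interchanging summations using the fact that $E_A \subseteq M''$ (for all $(D, A) \in \mathcal{S}$) from Claim 12. The first term in (5) uses the property in Theorem 6 that each $f \in F$ is dropped (i.e. $f \in D$) to extent at most $1 + \frac{1}{t}$; the second term uses the property in Theorem 6 that each $f^* \in F^*$ is added to extent one in $\mathcal{S}$, and Claim 12. Finally (6) is by Claim 11.

Adding the inequality $0 \le d(E \cap M^*) - d(M')$ from Claim 10 yields:

$$\sum_{(D,A) \in \mathcal{S}} \alpha(D, A) \cdot \left[\mathrm{Tree}(F - D + A) - \mathrm{Tree}(F)\right] \le \Bigl(3 + \frac{2}{t}\Bigr) \cdot d(E \cap M^*) - d(E \cap M),$$

since $M'$ and $M''$ partition the true edges $E \cap M$. Thus we obtain Inequality (2).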



4 Non-uniform k median forest 

In this section we study the following extension of k median forest. There is a set of vertices $V$ with weights $\{q_u\}_{u \in V}$, two metrics $d$ and $c$ defined on $V$, and a bound $k$. The goal is to find $S \subseteq V$ with $|S| = k$ minimizing $\sum_{u \in V} q_u \cdot d(u, S) + c(\mathrm{MST}(V/S))$. Here $\mathrm{MST}(V/S)$ is a minimum spanning tree in the graph obtained by contracting $S$ to a single vertex under metric $c$. The difference from the k median forest problem is that the cost functions for the k-median and MST parts in the objective are different.

¹ This is the only place in the proof where we use uniformity in the metrics for k-median and MST.

It is natural to consider the local search algorithm in this setting as well, since local search achieves good approximations for both k-median and MST. However the next lemma shows that the locality gap is unbounded even if we allow multiple swaps. The example is similar to the locality gap in [12].

Lemma 16 The locality gap of non-uniform k median forest with multi-swaps is unbounded. 
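
The two-metric instance used in the proof that follows can be checked by brute force for small $k$. The sketch below is an illustration under stated assumptions: vertices $u_{i,j}$ are encoded as pairs `(i, j)`; $d$ is 0 on pairs $\{u_{i,2}, u_{i+1,1}\}$ and 1 otherwise; $c$ is 0 on pairs $\{u_{i,1}, u_{i,2}\}$ and $M$ otherwise; the good solution is taken to be $\{u_{i,1}\}_{i=1}^{k}$. The function names are hypothetical.

```python
from itertools import combinations

def bad_instance(k, w, M):
    """Vertices u_{i,j} as (i, j); weights, median metric d and tree metric c."""
    V = [(i, j) for i in range(1, k + 1) for j in (1, 2)]
    q = {v: w for v in V}
    q[(k, 2)] = 1

    def d(x, y):
        if x == y:
            return 0
        a, b = sorted((x, y))
        # Distance 0 exactly between u_{i,2} and u_{i+1,1}; 1 otherwise.
        return 0 if (a[1] == 2 and b == (a[0] + 1, 1)) else 1

    def c(x, y):
        # Distance 0 within a pair {u_{i,1}, u_{i,2}}; M otherwise.
        return 0 if x[0] == y[0] else M

    return V, q, d, c

def objective(V, q, d, c, S):
    median = sum(q[u] * min(d(u, s) for s in S) for u in V)
    best = {v: min(c(v, s) for s in S) for v in V if v not in S}
    tree = 0
    while best:                       # Prim's algorithm under metric c
        v = min(best, key=best.get)
        tree += best.pop(v)
        for u in best:
            best[u] = min(best[u], c(u, v))
    return median + tree

def is_local_opt(V, q, d, c, L, t):
    """True if no swap of at most t centers strictly improves the objective."""
    base = objective(V, q, d, c, L)
    outside = [v for v in V if v not in L]
    for size in range(1, t + 1):
        for D in combinations(sorted(L), size):
            for A in combinations(outside, size):
                if objective(V, q, d, c, (L - set(D)) | set(A)) < base:
                    return False
    return True
```

With $k = 3$, $w = 10$ and $M = 1000$, the solution $\{u_{i,2}\}$ has value $w$ and survives all swaps of size at most $k - 1$, while swapping all $k$ centers reaches a solution of value one.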

Proof: Fix values $M \gg w \gg 1$. Let $V = \{u_{i,j} : i \in [k], j \in \{1, 2\}\}$, so $|V| = 2k$. Define vertex-weights as follows: $q(u_{k,2}) = 1$ and all other vertices have weight $w$. The metric $d$ for the k-median part is:

$$d(x, y) = \begin{cases} 0 & \text{if either } x = y \text{ or } \{x, y\} = \{u_{i,2}, u_{i+1,1}\} \text{ for some } i \in [k - 1] \\ 1 & \text{otherwise} \end{cases}$$

The second metric $c$ for the MST part of the objective is:

$$c(x, y) = \begin{cases} 0 & \text{if either } x = y \text{ or } \{x, y\} = \{u_{i,1}, u_{i,2}\} \text{ for some } i \in [k] \\ M & \text{otherwise} \end{cases}$$

Observe that for any $S \subseteq V$ with $|S| = k$, we have $c(\mathrm{MST}(V/S)) < M$ iff $|S \cap \{u_{i,1}, u_{i,2}\}| = 1$ for all $i \in [k]$. So the non-uniform k median forest objective is smaller than $M$ only if $|S \cap \{u_{i,1}, u_{i,2}\}| = 1$ for all $i \in [k]$.

We claim that the optimal value is at most one. Consider the solution $S^* = \{u_{i,1}\}_{i=1}^{k}$. It is clear that $c(\mathrm{MST}(V/S^*)) = 0$. Moreover, $\sum_{u \in V} q(u) \cdot d(u, S^*) = 1$, with vertex $u_{k,2}$ being the only contributor.

We now claim that the solution $L = \{u_{i,2}\}_{i=1}^{k}$ is locally optimal under even $(k - 1)$-swaps. First, observe that $c(\mathrm{MST}(V/L)) = 0$ and $\sum_{u \in V} q(u) \cdot d(u, L) = w$, with vertex $u_{1,1}$ being the only contributor. So $L$ has objective value of $w$. Secondly, notice that every solution $S$ obtained by some $(k - 1)$-swap of $L$ has either MST-objective of $M$ or median-objective of at least $w$. Thus $L$ is a local optimum and the locality gap is $w \gg 1$. ■

We remark that the near-optimality proof of local search in the previous section only requires the following consistency property between the two metrics: for any pair $e, f$ of edges, $d_e \le d_f \Rightarrow c_e \le c_f$. In spite of the large locality gap, we show that non-uniform k median forest admits a constant factor approximation algorithm via an LP approach.

The algorithm. We make use of the following natural LP relaxation for non-uniform k median forest. The variables $y_v$ denote the probability of locating a depot at $v$; $x_{uv}$ denotes the extent to which vertex $u$ is connected to a depot at $v$ (for the k-median part); and $z_e$ denotes the extent to which edge $e$ is used in the MST part of the objective. Also $E = \binom{V}{2}$ is the set of all edges in the metric. Define $H$ to be the complete graph on vertices $V \cup \{r\}$ (for a new vertex $r$) with edges $E \cup \{(r, v) : v \in V\}$.




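$$
\begin{array}{llll}
\text{minimize} & \sum_{u \in V} q_u \sum_{v \in V} d(u, v) \cdot x_{uv} + \sum_{e \in E} c_e \cdot z_e & & \text{(LP)} \\
\text{subject to} & \sum_{v \in V} x_{uv} = 1 & \forall u \in V & (7) \\
& x_{uv} \le y_v & \forall u \in V, v \in V & (8) \\
& \sum_{v \in V} y_v \le k & & (9) \\
& (y, z) \in \mathrm{SP}(H) & & (10) \\
& x_{uv}, y_v, z_e \ge 0 & \forall u, v \in V, \forall e \in E & (11)
\end{array}
$$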

Above, $\mathrm{SP}(H)$ denotes the spanning tree polytope of graph $H$, which admits a linear description in terms of its edge variables; see e.g. [18]. Also $(y, z) \in \mathrm{SP}(H)$ corresponds to the fractional spanning tree in $H$ with values $z_e$ on edges $e \in E$ and value $y_v$ on each edge $(r, v)$. It can be checked directly that this is a valid relaxation of non-uniform k median forest. Moreover this LP can be solved exactly in polynomial time to obtain solution $(x^*, y^*, z^*)$ using the Ellipsoid algorithm.

We now describe the rounding procedure. Let $\mathcal{I}$ denote the instance of k median forest and $LP_{med} = \sum_{u \in V} q_u \sum_{v \in V} d(u, v) \cdot x^*_{uv}$ denote the median part of the optimal LP solution. Apply Stage I of the rounding algorithm in [12] to modify variables $x^*$ to $x$ (here $y^*$ and $z^*$ remain unchanged), with the following properties:

• Set $R \subseteq V$ of representatives with weights $w_u$ for each $u \in R$, which defines a new instance $\mathcal{M}$ of non-uniform k median forest (the weights of vertices in $V \setminus R$ are zero).

• Any solution to the new instance $\mathcal{M}$ with objective $C$ is a solution to the original instance $\mathcal{I}$ having objective at most $C + 4 \cdot LP_{med}$.

• $(x, y^*, z^*)$ is feasible for LP($\mathcal{M}$).

• Disjoint collection of subsets $\{V(u) \subseteq V\}_{u \in R}$ with $\sum_{v \in V(u)} y^*_v \ge \frac{1}{2}$ for all $u \in R$.

• Collection of pseudoroots $\{(a_i, b_i) \in \binom{R}{2}\}_i$ with each representative in at most one pseudoroot.

• Map $\sigma : R \to R$ where $\sigma(u)$ lies in a pseudoroot for each $u \in R$.

• Each $u \in R$ is connected (under $x$) only to $V(u) \cup \{\sigma(u)\}$.



y 



eviu) ■ 



<4 - LP^ed- 



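Now apply the LP reformulation from Stage II in [12] to eliminate $x$-variables in LP, using the above structure of $(x, y^*, z^*)$, and obtain:

$$
\begin{array}{llll}
\text{minimize} & \sum_{u \in R} w_u \Bigl[ \sum_{v \in V(u)} d(u, v) \cdot y_v + d(u, \sigma(u)) \Bigl( 1 - \sum_{v \in V(u)} y_v \Bigr) \Bigr] + \sum_{e \in E} c_e \cdot z_e & & (\text{LP}_{new}) \\
\text{subject to} & \sum_{v \in V(u)} y_v \le 1 & \forall u \in R & (12) \\
& \sum_{v \in V(a_i)} y_v + \sum_{v \in V(b_i)} y_v \ge 1 & \forall \text{ pseudoroots } (a_i, b_i) & (13) \\
& \sum_{v \in V} y_v \le k & & (14) \\
& (y, z) \in \mathrm{SP}(H) & & (15) \\
& y_v, z_e \ge 0 & \forall v \in V, \forall e \in E & (16)
\end{array}
$$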
Based on the above properties, it follows that $(y^*, z^*)$ is a feasible solution to LP$_{new}$ with objective at most $4 \cdot LP_{med} + c \cdot z^*$, i.e. at most four times the optimal value of LP($\mathcal{I}$). The advantage of the new LP is:

Lemma 17 Any basic feasible solution to LP$_{new}$ is integral.

Proof: Let $(y, z)$ denote any basic feasible solution. The constraints from (12)-(14) define a laminar family on just $y$ variables. By a standard uncrossing argument, we can choose a maximum linearly independent set of tight rank constraints in (15) to be a chain on $y, z$ variables. Thus a maximum linearly independent set of tight constraints in LP$_{new}$ can be described as the intersection of two laminar families; this is always a totally unimodular matrix, and hence $(y, z)$ must be integral. ■

LP$_{new}$ can be solved exactly in polynomial time to obtain an extreme point solution using the Ellipsoid algorithm and the approach in Jain [10]; by the above lemma this solution is integral. Finally, using Lemma 3.3 in [12], any integral solution to LP$_{new}$ of value $L$ is also a valid solution to the k median forest instance $\mathcal{M}$ of value at most $3 \cdot L$. Altogether we obtain an integral solution $S^*$ to $\mathcal{M}$ of value at most 12 times the optimum of LP($\mathcal{I}$). Combined with the relation between instances $\mathcal{I}$ and $\mathcal{M}$, we have that $S^*$ is a valid solution to $\mathcal{I}$ of objective at most 16 times the optimum of $\mathcal{I}$, thereby proving Theorem 3.

A Example comparing k-median, k-tree and k median forest

We give an example which shows that near-optimal solutions to the k-median, k-tree and k median forest problems can be very far from each other. This implies that an approximation algorithm for k median forest must simultaneously take into account both the median and tree parts of its objective. (For example, we cannot merely solve k-median and k-tree separately and take the better of those solutions.)

The underlying metric consists of six vertices {mq, mi, 112} U{^o, vi,V2}. Let £ be a parameter that 
will be set to be arbitrarily large. The distance between any Ui and Vj (for all i, j G {0, 1, 2}) is infinite; 
d{uo,ui) = d{uo,U2) = i^, d{ui,U2) = and d{uo,ui) = d{uo,U2) = i'^, d{ui,U2) = L The 
weights of vertices are q{ui) = q{u2) = q{y\) = q{v2) = and q{uQ) = q{vo) = 1. The bound k = 4 
and parameter p = £^ for the k median forest problem. Let Smed^ Stree and Skmf denote solutions that 
are o(^)-approximately optimal for the fc-median, A;-tree and k median forest objectives respectively. We 
claim that Smed^ Stree and Skmf are mutually disjoint. 

It can be checked directly that the optimal /c-median value is £^ + £^ < 2£^. Moreover the only 
solution of value o{i^) is {ui,U2, f 1, V2}; so S'merf consists of just this solution. 

The optimal A;-tree value is £ + < 1^. For any solution F G Stree (i-e. having value o(^^)), we 
must have , Wo £ |Fn{ni,U2}| = 1 and |Fn{7;i, ?;2}| = 1. So S'jree consists of these 4 solutions. 

For the k median forest objective it can be seen that the optimal value is p-£^ ^£^ ■£ = '!£!' ; from the 
solutions {■ui, ii2, vq^v\\ and {u\,U2., ^^Oj ^^2}- Moreover, any other solution has value f2(^^); so S'fcm/ 
consists of the above two solutions. Clearly S^ed^ Stree and S\^rnj are disjoint. 